Client Report - What’s in a Name?

Course DS 250

Author

Jaison Jonnakuti

import pandas as pd
import numpy as np
from lets_plot import *

LetsPlot.setup_html(isolated_frame=True)

Project Notes

For Project 1 the answer to each question should include a chart and a written response. The year labels on your charts should not include a comma. At least two of your charts must include reference marks.

# Learn more about Code Cells: https://quarto.org/docs/reference/cells/cells-jupyter.html

# Load the names-by-year data set from the byuidatascience GitHub repository
df = pd.read_csv("https://github.com/byuidatascience/data4names/raw/master/data-raw/names_year/names_year.csv")

QUESTION|TASK 1

How does your name at your birth year compare to its use historically?

This section compares the use of the name “jai” in my birth year with its historical usage. The chart is empty because the data set has no rows whose name is exactly “jai”. This suggests the name is exceptionally rare in the United States, although the match is case-sensitive, so records stored with different capitalization would not be counted.

# label: jai Name Graph
# code-summary: Compare the use of "jai" over time
# fig-cap: "Historical usage of the name 'jai' over time (no data available)"
# fig-align: center

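# Exact, case-sensitive match on the lowercase string 'jai'
name_jai = df.query("name == 'jai'")

chart_jai = (ggplot(name_jai, aes('year', 'Total')) +
             geom_line(color='blue') +
             # Reference line at zero, the baseline for a name with no records
             geom_hline(yintercept=0, linetype='dotted', color='red') +
             # format='d' prints years as plain integers, without a thousands comma
             scale_x_continuous(format='d') +
             ggtitle("Historical Usage of the Name 'jai'") +
             xlab('Year') +
             ylab('Count') +
             theme(axis_text_x=element_text(angle=45, hjust=1)))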

chart_jai.show()
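
Because `df.query("name == 'jai'")` matches the string exactly, an empty result can mean either that the name is absent or that it is stored with different capitalization. The sketch below shows one way to tell the two cases apart. It uses a small made-up table in place of `names_year.csv`; the helper name `find_name` and the sample rows are assumptions for illustration, not part of the original analysis or the real data.

```python
import pandas as pd

# Made-up rows standing in for names_year.csv (illustration only).
sample = pd.DataFrame({
    "name": ["Jai", "Jai", "Brittany"],
    "year": [2001, 2002, 1990],
    "Total": [12, 15, 30000],
})

def find_name(frame, name):
    """Return rows whose name matches `name`, ignoring letter case."""
    return frame[frame["name"].str.lower() == name.lower()]

exact = sample.query("name == 'jai'")   # case-sensitive comparison
relaxed = find_name(sample, "jai")      # case-insensitive comparison

print(len(exact), len(relaxed))
```

If the relaxed lookup also returns no rows on the real data, the conclusion that the name is not recorded stands on firmer ground.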

QUESTION|TASK 2

If you talked to someone named Brittany on the phone, what is your guess of his or her age? What ages would you not guess?

Brittany was very popular in the late 1980s and 1990s, so I would guess the person is in their 30s. I would not guess 60 or older, since Brittany was rarely used that long ago.

# label: Brittany Graph
# code-summary: Plot the usage of 'Brittany' over time
# fig-cap: "Name popularity for 'Brittany' with reference line"
# fig-align: center

brittany_df = df.query("name == 'Brittany'")

brittany_chart = (
    ggplot(brittany_df, aes('year', 'Total')) +
    geom_line(color='blue') +
    # Reference line at 1990
    geom_vline(xintercept=1990, linetype='dashed', color='red') +
    # format='d' prints years as plain integers, without a thousands comma
    scale_x_continuous(format='d') +
    ggtitle("Usage of the Name 'Brittany' Over Time") +
    xlab("Year") +
    ylab("Total Count") +
    theme(axis_text_x=element_text(angle=45, hjust=1))
)

brittany_chart.show()
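
The age guess can also be derived from the counts rather than read off the chart. The following is a minimal sketch, assuming a table with `year` and `Total` columns like the one plotted here. The function name `age_guess`, the 25% threshold, the year 2025, and the toy counts are all assumptions chosen for illustration; none of them come from the real data.

```python
import pandas as pd

# Made-up counts for illustration only; not taken from names_year.csv.
toy = pd.DataFrame({
    "year": [1970, 1980, 1990, 2000, 2010],
    "Total": [100, 5000, 30000, 8000, 500],
})

def age_guess(frame, as_of_year, threshold=0.25):
    """Return (best_guess_age, youngest_age, oldest_age).

    The best guess uses the peak year. The range covers every year whose
    count is at least `threshold` times the peak count.
    """
    peak_row = frame.loc[frame["Total"].idxmax()]
    peak_year = int(peak_row["year"])
    common = frame[frame["Total"] >= threshold * peak_row["Total"]]
    youngest = as_of_year - int(common["year"].max())
    oldest = as_of_year - int(common["year"].min())
    return as_of_year - peak_year, youngest, oldest

best, youngest, oldest = age_guess(toy, as_of_year=2025)
print(best, youngest, oldest)
```

Ages outside the returned range are the ones "not to guess"; a stricter or looser threshold would narrow or widen that range.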

QUESTION|TASK 3

Mary, Martha, Peter, and Paul are all Christian names. From 1920 - 2000, compare the name usage of each of the four names in a single chart. What trends do you notice?

This section compares the usage of “Mary,” “Martha,” “Peter,” and “Paul” between 1920 and 2000. Mary was extremely popular early (1920–1960), then declined. Martha had moderate usage, peaking mid-century, then tapered off. Peter and Paul show steadier usage but also decline after 1970.

# label: Biblical Names Graph
# code-summary: Compare Mary, Martha, Peter, and Paul from 1920 to 2000
# fig-align: center

biblical_names = ['Mary', 'Martha', 'Peter', 'Paul']
# Keep only the four names within the requested 1920-2000 window
biblical_data = df.query("name in @biblical_names and 1920 <= year <= 2000")

chart_biblical = (ggplot(biblical_data, aes('year', 'Total', color='name')) +
                  geom_line() +
                  # format='d' prints years as plain integers, without a thousands comma
                  scale_x_continuous(format='d') +
                  ggtitle("Trends of Biblical Names (1920-2000)") +
                  xlab('Year') +
                  ylab('Count') +
                  theme(axis_text_x=element_text(angle=45, hjust=1)))

chart_biblical.show()
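
Statements such as "peaking mid-century" can be checked by computing each name's peak year directly. The sketch below is one possible way to do that, assuming the same `name`, `year`, and `Total` columns used in the chart. The helper `peak_by_name` and the toy rows are hypothetical and exist only to make the example self-contained.

```python
import pandas as pd

# Made-up counts for two names (illustration only).
toy = pd.DataFrame({
    "name": ["Mary", "Mary", "Mary", "Paul", "Paul", "Paul"],
    "year": [1920, 1950, 2000, 1920, 1950, 2000],
    "Total": [1200, 1000, 100, 200, 400, 50],
})

def peak_by_name(frame):
    """Return one row per name holding the year with the highest count."""
    peak_index = frame.groupby("name")["Total"].idxmax()
    return frame.loc[peak_index, ["name", "year", "Total"]].set_index("name")

peaks = peak_by_name(toy)
print(peaks)
```

Applied to `biblical_data`, the result would give one peak year per name to cite alongside the chart.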

QUESTION|TASK 4

Think of a unique name from a famous movie. Plot the usage of that name and see how changes line up with the movie release. Does it look like the movie had an effect on usage?

This section examines whether the popularity of the name “Peter” was affected by the Spider-Man films released in 2002, 2012, and 2017. No strong effect is visible: usage has been trending steadily downward.

# label: Movie Name Chart
# code-summary: Analyze and plot trends for the name "Peter"
# fig-cap: "Trends for the name 'Peter' over time, including Spider-Man movie release years"
# fig-align: center
# Plot 'Peter' with one reference line per Spider-Man release year

name_peter = df.query("name == 'Peter'")

chart_peter = (ggplot(name_peter, aes('year', 'Total')) +
               geom_line(color='green') +
               geom_vline(xintercept=2002, linetype='dotted', color='red') +
               geom_vline(xintercept=2012, linetype='dotted', color='blue') +
               geom_vline(xintercept=2017, linetype='dotted', color='orange') +
               # format='d' prints years as plain integers, without a thousands comma
               scale_x_continuous(format='d') +
               ggtitle("Trends for the Name 'Peter' with Movie Releases") +
               xlab('Year') +
               ylab('Count') +
               theme(axis_text_x=element_text(angle=45, hjust=1),
                     axis_text_y=element_text(size=8)))

chart_peter.show()
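
One rough way to put a number on "no strong effect" is to compare average usage in the few years before a release with the few years after it. The sketch below assumes `year` and `Total` columns; the function `before_after_change`, the three-year window, and the toy counts are assumptions for illustration. It is descriptive only: a name that was already declining will show a negative change whether or not the movie mattered.

```python
import pandas as pd

# Made-up, steadily declining counts (illustration only).
toy = pd.DataFrame({
    "year": [1999, 2000, 2001, 2002, 2003, 2004, 2005],
    "Total": [1000, 980, 960, 940, 920, 900, 880],
})

def before_after_change(frame, event_year, window=3):
    """Relative change in mean count from the `window` years before
    `event_year` to the `window` years after it (event year excluded)."""
    years = frame["year"]
    before = frame[(years >= event_year - window) & (years < event_year)]["Total"].mean()
    after = frame[(years > event_year) & (years <= event_year + window)]["Total"].mean()
    return (after - before) / before

change = before_after_change(toy, event_year=2002)
print(round(change, 4))
```

A change that simply continues the slope seen before the release is consistent with the reading that the movie had little effect.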

STRETCH QUESTION|TASK 1

Reproduce the chart Elliot using the data from the names_year.csv file.

The name “Elliot” is analyzed in the same way as the previous charts. The chart highlights key milestones, such as the release of the movie E.T., that may have influenced the name’s popularity. It recreates the reference chart, including vertical reference lines at 1982, 1988, and 2002.

# label: Elliot Chart with Milestones
# code-summary: Analyze the trends for 'Elliot' and highlight key release events
# fig-cap: "Trends for the name 'Elliot' with key movie release milestones"
# fig-align: center

# Create the line chart for the name "Elliot", starting in 1950

name_elliot = df.query("name == 'Elliot' and year >= 1950")

chart_elliot = (ggplot(name_elliot, aes('year', 'Total')) +
                geom_line(color='orange') +
                # Reference lines at the three milestone years
                geom_vline(xintercept=1982, linetype='dashed', color='red') +
                geom_vline(xintercept=1988, linetype='dashed', color='red') +
                geom_vline(xintercept=2002, linetype='dashed', color='red') +
                # format='d' prints years as plain integers, without a thousands comma
                scale_x_continuous(format='d') +
                ggtitle("Trends for the Name 'Elliot' with Movie Milestones") +
                xlab('Year') +
                ylab('Count') +
                theme(axis_text_x=element_text(angle=45, hjust=1),
                      axis_text_y=element_text(size=8)))

chart_elliot.show()
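
The three milestone years are hard-coded in separate `geom_vline` calls. As a possible alternative, they could be kept in one small table together with a label, which also makes it easy to look up the count in each milestone year. The sketch below assumes this layout. The years come from the chart code, while the helper `counts_at_milestones`, the label text for the second and third milestones, and the toy counts are placeholders rather than facts from the report.

```python
import pandas as pd

# Milestone years from the chart code; the labels for 1988 and 2002 are placeholders.
milestones = pd.DataFrame({
    "year": [1982, 1988, 2002],
    "label": ["E.T. released", "Milestone 2", "Milestone 3"],
})

# Made-up counts standing in for the filtered 'Elliot' rows (illustration only).
toy = pd.DataFrame({
    "year": [1981, 1982, 1988, 2002],
    "Total": [300, 320, 450, 600],
})

def counts_at_milestones(frame, events):
    """Attach the count recorded in each milestone year (NaN if the year is missing)."""
    return events.merge(frame[["year", "Total"]], on="year", how="left")

table = counts_at_milestones(toy, milestones)
print(table)
```

The same `milestones` table should also be usable as the data for the reference lines, for example with `geom_vline(aes(xintercept='year'), data=milestones)`, so the years are defined in one place.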